CaKernel - A GPGPU Kernel Abstraction and Implementation for Scientific Computing on Heterogeneous Systems

Authors

Jian Tao

Steven R. Brandt

Marek Blazewicz

Abstract

We present our work on designing and implementing a GPGPU kernel abstraction suitable for developing highly efficient, large-scale scientific applications that use stencil computations on hybrid CPU/GPU systems. Leveraging the MPI-based data parallelism implemented in Cactus, we have developed the CaKernel programming framework for the CUDA/OpenCL architecture. It facilitates development by automatically generating highly optimized CUDA/OpenCL code for all declared kernel functions, using a kernel descriptor, a set of computation templates, and a code generator. The kernel abstraction implementation has been tested and benchmarked with a 3D CFD implementation based on a finite difference discretization of the Navier-Stokes equations. Our current effort focuses on minimizing the cost of data exchange between GPU and CPU and on optimizing the boundary exchange. Further integration in this area may improve performance and scalability.

Similar resources

Computer Vision Accelerators for Mobile Systems based on OpenCL GPGPU Co-Processing

In this paper, we present an OpenCL-based heterogeneous implementation of a computer vision algorithm – image inpainting-based object removal algorithm – on mobile devices. To take advantage of the computation power of the mobile processor, the algorithm workflow is partitioned between the CPU and the GPU based on the profiling results on mobile devices, so that the computationally-intensive ke...

Exploring Multi-level Parallelism for Large-Scale Spiking Neural Networks

Several biologically inspired applications have been motivated by Spiking Neural Networks (SNNs) such as the Hodgkin-Huxley (HH) and Izhikevich models, owing to their high biological accuracy. The inherent massively parallel nature of the SNN simulations makes them a good fit for heterogeneous computing resources such as the General Purpose Graphical Processing Unit (GPGPU) clusters. In this re...

Accelerating Data-Serial Applications on Data-Parallel GPGPUs: A Systems Approach

The general-purpose graphics processing unit (GPGPU) continues to make significant strides in high-end computing by delivering unprecedented performance at a commodity price. However, the many-core architecture of the GPGPU currently allows only data-parallel applications to extract the full potential out of the hardware. Applications that require frequent synchronization during their execution...

An Efficient Genetic Algorithm for Task Scheduling on Heterogeneous Computing Systems Based on TRIZ

Efficient assignment and scheduling of tasks is one of the key elements in the effective utilization of heterogeneous multiprocessor systems. Because the task scheduling problem has been proven to be NP-hard, we use meta-heuristic methods to find a suboptimal schedule. In this paper we propose a new approach using TRIZ (especially the 40 inventive principles). The basic idea of thi...

Publication date: 2011
